Overview

Dataset statistics

Number of variables: 7
Number of observations: 3141
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 687.3 KiB
Average record size in memory: 224.1 B
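
The average record size can be cross-checked from the total size and the number of observations. A minimal sketch, assuming 1 KiB = 1024 bytes and one-decimal rounding; the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 687.3
n_observations = 3141

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```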

Variable types

Numeric: 5
Categorical: 2

Alerts

state has a high cardinality: 51 distinct values (High cardinality)
area_name has a high cardinality: 1876 distinct values (High cardinality)
fips is highly overall correlated with state (High correlation)
urban_influence_code_2013 is highly overall correlated with ci90ub517_2019 (High correlation)
ci90ub517_2019 is highly overall correlated with urban_influence_code_2013 (High correlation)
ci90ub517p_2019 is highly overall correlated with ci90ubinc_2019 (High correlation)
ci90ubinc_2019 is highly overall correlated with ci90ub517p_2019 (High correlation)
state is highly overall correlated with fips (High correlation)
fips has unique values (Unique)
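
The fips/state correlation is structural rather than informative: a five-digit county FIPS code begins with the two-digit state FIPS code, so fips fully determines state. A small sketch of that relationship, where `state_prefix` is a hypothetical helper and the two codes are the minimum and maximum fips values in this report:

```python
def state_prefix(county_fips: int) -> str:
    """Return the two-digit state part of a county FIPS code stored as an integer."""
    # Zero-pad to five digits first, because integer storage drops the leading zero.
    return f"{county_fips:05d}"[:2]

print(state_prefix(1001))   # smallest fips in the dataset (an AL county)
print(state_prefix(56045))  # largest fips in the dataset (a WY county)
```

Because fips is an identifier, its mean, skewness and similar statistics carry little meaning; treating it as a string key is usually the safer choice.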

Reproduction

Analysis started: 2023-01-17 13:58:36.314632
Analysis finished: 2023-01-17 14:00:50.262926
Duration: 2 minutes and 13.95 seconds
Software version: pandas-profiling v3.6.2
Download configuration: config.json
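
The reported duration follows directly from the two timestamps. A short sketch using only the standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-01-17 13:58:36.314632")
finished = datetime.fromisoformat("2023-01-17 14:00:50.262926")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```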

Variables

fips
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 3141
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30388.545
Minimum: 1001
Maximum: 56045
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.7 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 5093
Q1: 18179
median: 29177
Q3: 45081
95-th percentile: 53063
Maximum: 56045
Range: 55044
Interquartile range (IQR): 26902
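
Range and IQR are simple differences of the quantiles listed here. A minimal sketch using the fips figures from this section; the variable names are illustrative:

```python
# Quantile figures for fips, copied from this section.
minimum, q1, q3, maximum = 1001, 18179, 45081, 56045

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(value_range, iqr)
```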

Descriptive statistics

Standard deviation: 15162.438
Coefficient of variation (CV): 0.49895242
Kurtosis: -1.0976458
Mean: 30388.545
Median Absolute Deviation (MAD): 12020
Skewness: -0.080325149
Sum: 95450421
Variance: 2.2989953 × 10^8
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1001                  1       < 0.1%
39091                 1       < 0.1%
39095                 1       < 0.1%
39097                 1       < 0.1%
39099                 1       < 0.1%
39101                 1       < 0.1%
39103                 1       < 0.1%
39105                 1       < 0.1%
39107                 1       < 0.1%
39109                 1       < 0.1%
Other values (3131)   3131    99.7%

Value   Count   Frequency (%)
1001    1       < 0.1%
1003    1       < 0.1%
1005    1       < 0.1%
1007    1       < 0.1%
1009    1       < 0.1%
1011    1       < 0.1%
1013    1       < 0.1%
1015    1       < 0.1%
1017    1       < 0.1%
1019    1       < 0.1%

Value   Count   Frequency (%)
56045   1       < 0.1%
56043   1       < 0.1%
56041   1       < 0.1%
56039   1       < 0.1%
56037   1       < 0.1%
56035   1       < 0.1%
56033   1       < 0.1%
56031   1       < 0.1%
56029   1       < 0.1%
56027   1       < 0.1%

state
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct: 51
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 181.1 KiB

TX                  254
GA                  159
VA                  133
KY                  120
MO                  115
Other values (46)   2360

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 6282
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: AL
2nd row: AL
3rd row: AL
4th row: AL
5th row: AL

Common Values

Value               Count   Frequency (%)
TX                  254     8.1%
GA                  159     5.1%
VA                  133     4.2%
KY                  120     3.8%
MO                  115     3.7%
KS                  105     3.3%
IL                  102     3.2%
NC                  100     3.2%
IA                  99      3.2%
TN                  95      3.0%
Other values (41)   1859    59.2%
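
The Frequency (%) column and the "Other values" row can be recomputed from the counts and the 3141 observations. A sketch under the assumption that percentages are rounded to one decimal place; the dictionary is simply the ten listed states:

```python
# Counts for the ten most common states, copied from the Common Values table.
counts = {"TX": 254, "GA": 159, "VA": 133, "KY": 120, "MO": 115,
          "KS": 105, "IL": 102, "NC": 100, "IA": 99, "TN": 95}
n_observations = 3141

# Everything not listed individually falls into the "Other values" row.
other = n_observations - sum(counts.values())

percentages = {state: round(100 * n / n_observations, 1) for state, n in counts.items()}
other_pct = round(100 * other / n_observations, 1)
print(other, other_pct, percentages["TX"])
```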

Length

Histogram of lengths of the category
Value               Count   Frequency (%)
tx                  254     8.1%
ga                  159     5.1%
va                  133     4.2%
ky                  120     3.8%
mo                  115     3.7%
ks                  105     3.3%
il                  102     3.2%
nc                  100     3.2%
ia                  99      3.2%
tn                  95      3.0%
Other values (41)   1859    59.2%

Most occurring characters

Value               Count   Frequency (%)
A                   819     13.0%
N                   663     10.6%
M                   510     8.1%
I                   501     8.0%
T                   456     7.3%
O                   380     6.0%
K                   331     5.3%
L                   300     4.8%
S                   299     4.8%
C                   277     4.4%
Other values (14)   1746    27.8%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   6282    100.0%

Most frequent character per category

Uppercase Letter
Value               Count   Frequency (%)
A                   819     13.0%
N                   663     10.6%
M                   510     8.1%
I                   501     8.0%
T                   456     7.3%
O                   380     6.0%
K                   331     5.3%
L                   300     4.8%
S                   299     4.8%
C                   277     4.4%
Other values (14)   1746    27.8%

Most occurring scripts

Value   Count   Frequency (%)
Latin   6282    100.0%

Most frequent character per script

Latin
Value               Count   Frequency (%)
A                   819     13.0%
N                   663     10.6%
M                   510     8.1%
I                   501     8.0%
T                   456     7.3%
O                   380     6.0%
K                   331     5.3%
L                   300     4.8%
S                   299     4.8%
C                   277     4.4%
Other values (14)   1746    27.8%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   6282    100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
A                   819     13.0%
N                   663     10.6%
M                   510     8.1%
I                   501     8.0%
T                   456     7.3%
O                   380     6.0%
K                   331     5.3%
L                   300     4.8%
S                   299     4.8%
C                   277     4.4%
Other values (14)   1746    27.8%

area_name
Categorical

Distinct: 1876
Distinct (%): 59.7%
Missing: 0
Missing (%): 0.0%
Memory size: 218.0 KiB

Washington County     30
Jefferson County      25
Franklin County       24
Jackson County        23
Lincoln County        23
Other values (1871)   3016

Length

Max length: 33
Median length: 28
Mean length: 14.0156
Min length: 10

Characters and Unicode

Total characters: 44023
Distinct characters: 55
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
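
The mean length reported for area_name is the total character count divided by the number of observations. A quick check, assuming four-decimal rounding; the names are illustrative:

```python
# Figures copied from the area_name section and the dataset statistics.
total_characters = 44023
n_observations = 3141

mean_length = total_characters / n_observations
print(round(mean_length, 4))
```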

Unique

Unique: 1452
Unique (%): 46.2%

Sample

1st row: Autauga County
2nd row: Baldwin County
3rd row: Barbour County
4th row: Bibb County
5th row: Blount County

Common Values

Value                 Count   Frequency (%)
Washington County     30      1.0%
Jefferson County      25      0.8%
Franklin County       24      0.8%
Jackson County        23      0.7%
Lincoln County        23      0.7%
Madison County        19      0.6%
Montgomery County     18      0.6%
Clay County           18      0.6%
Monroe County         17      0.5%
Union County          17      0.5%
Other values (1866)   2927    93.2%

Length

Histogram of lengths of the category
Value                 Count   Frequency (%)
county                3006    46.2%
parish                64      1.0%
city                  44      0.7%
washington            31      0.5%
jefferson             28      0.4%
franklin              26      0.4%
st                    26      0.4%
jackson               24      0.4%
lincoln               24      0.4%
madison               20      0.3%
Other values (1864)   3219    49.4%

Most occurring characters

Value               Count   Frequency (%)
n                   4871    11.1%
o                   4741    10.8%
t                   4041    9.2%
u                   3583    8.1%
C                   3418    7.8%
y                   3389    7.7%
(space)             3371    7.7%
a                   2245    5.1%
e                   2165    4.9%
r                   1608    3.7%
Other values (45)   10591   24.1%

Most occurring categories

Value               Count   Frequency (%)
Lowercase Letter    34106   77.5%
Uppercase Letter    6509    14.8%
Space Separator     3371    7.7%
Other Punctuation   31      0.1%
Dash Punctuation    6       < 0.1%

Most frequent character per category

Lowercase Letter
Value               Count   Frequency (%)
n                   4871    14.3%
o                   4741    13.9%
t                   4041    11.8%
u                   3583    10.5%
y                   3389    9.9%
a                   2245    6.6%
e                   2165    6.3%
r                   1608    4.7%
l                   1266    3.7%
i                   1262    3.7%
Other values (16)   4935    14.5%

Uppercase Letter
Value               Count   Frequency (%)
C                   3418    52.5%
M                   311     4.8%
S                   283     4.3%
P                   263     4.0%
B                   261     4.0%
W                   225     3.5%
L                   224     3.4%
H                   198     3.0%
G                   156     2.4%
D                   146     2.2%
Other values (15)   1024    15.7%

Other Punctuation
Value   Count   Frequency (%)
.       27      87.1%
'       4       12.9%

Space Separator
Value     Count   Frequency (%)
(space)   3371    100.0%

Dash Punctuation
Value   Count   Frequency (%)
-       6       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Latin    40615   92.3%
Common   3408    7.7%

Most frequent character per script

Latin
Value               Count   Frequency (%)
n                   4871    12.0%
o                   4741    11.7%
t                   4041    9.9%
u                   3583    8.8%
C                   3418    8.4%
y                   3389    8.3%
a                   2245    5.5%
e                   2165    5.3%
r                   1608    4.0%
l                   1266    3.1%
Other values (41)   9288    22.9%

Common
Value     Count   Frequency (%)
(space)   3371    98.9%
.         27      0.8%
-         6       0.2%
'         4       0.1%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   44023   100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
n                   4871    11.1%
o                   4741    10.8%
t                   4041    9.2%
u                   3583    8.1%
C                   3418    7.8%
y                   3389    7.7%
(space)             3371    7.7%
a                   2245    5.1%
e                   2165    4.9%
r                   1608    3.7%
Other values (45)   10591   24.1%

urban_influence_code_2013
Real number (ℝ)

Distinct: 12
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.2687042
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 5
Q3: 8
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.4993727
Coefficient of variation (CV): 0.6641809
Kurtosis: -1.1168181
Mean: 5.2687042
Median Absolute Deviation (MAD): 3
Skewness: 0.40191322
Sum: 16549
Variance: 12.245609
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value              Count   Frequency (%)
2                  733     23.3%
1                  432     13.8%
6                  344     11.0%
8                  269     8.6%
5                  242     7.7%
10                 189     6.0%
9                  184     5.9%
12                 182     5.8%
7                  162     5.2%
4                  149     4.7%
Other values (2)   255     8.1%

Value   Count   Frequency (%)
1       432     13.8%
2       733     23.3%
3       130     4.1%
4       149     4.7%
5       242     7.7%
6       344     11.0%
7       162     5.2%
8       269     8.6%
9       184     5.9%
10      189     6.0%

Value   Count   Frequency (%)
12      182     5.8%
11      125     4.0%
10      189     6.0%
9       184     5.9%
8       269     8.6%
7       162     5.2%
6       344     11.0%
5       242     7.7%
4       149     4.7%
3       130     4.1%

ci90ub517_2019
Real number (ℝ)

Distinct: 2125
Distinct (%): 67.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3171.0462
Minimum: 4
Maximum: 291598
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.7 KiB

Quantile statistics

Minimum: 4
5-th percentile: 96
Q1: 431
median: 990
Q3: 2289
95-th percentile: 12441
Maximum: 291598
Range: 291594
Interquartile range (IQR): 1858

Descriptive statistics

Standard deviation: 10217.041
Coefficient of variation (CV): 3.221978
Kurtosis: 278.33935
Mean: 3171.0462
Median Absolute Deviation (MAD): 713
Skewness: 13.50658
Sum: 9960256
Variance: 1.0438793 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
141                   7       0.2%
357                   6       0.2%
625                   5       0.2%
170                   5       0.2%
765                   5       0.2%
92                    5       0.2%
361                   5       0.2%
315                   5       0.2%
164                   5       0.2%
223                   5       0.2%
Other values (2115)   3088    98.3%

Value   Count   Frequency (%)
4       1       < 0.1%
11      1       < 0.1%
12      1       < 0.1%
13      2       0.1%
15      3       0.1%
16      1       < 0.1%
17      2       0.1%
18      3       0.1%
19      3       0.1%
20      2       0.1%

Value    Count   Frequency (%)
291598   1       < 0.1%
191382   1       < 0.1%
149400   1       < 0.1%
131963   1       < 0.1%
108106   1       < 0.1%
104650   1       < 0.1%
92682    1       < 0.1%
88586    1       < 0.1%
84229    1       < 0.1%
84164    1       < 0.1%

ci90ub517p_2019
Real number (ℝ)

Distinct: 501
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.675422
Minimum: 2.9
Maximum: 70.1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.7 KiB

Quantile statistics

Minimum: 2.9
5-th percentile: 9.9
Q1: 16.7
median: 23
Q3: 30.9
95-th percentile: 44.6
Maximum: 70.1
Range: 67.2
Interquartile range (IQR): 14.2

Descriptive statistics

Standard deviation: 10.743469
Coefficient of variation (CV): 0.43539148
Kurtosis: 0.70020474
Mean: 24.675422
Median Absolute Deviation (MAD): 7
Skewness: 0.8038354
Sum: 77505.5
Variance: 115.42212
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
19.3                 20      0.6%
16                   18      0.6%
25.5                 18      0.6%
16.4                 18      0.6%
21.8                 18      0.6%
20.1                 18      0.6%
21.2                 18      0.6%
15.5                 17      0.5%
17.3                 17      0.5%
22.7                 17      0.5%
Other values (491)   2962    94.3%

Value   Count   Frequency (%)
2.9     1       < 0.1%
3.8     2       0.1%
3.9     1       < 0.1%
4.2     1       < 0.1%
4.3     1       < 0.1%
4.4     3       0.1%
5       2       0.1%
5.1     1       < 0.1%
5.2     1       < 0.1%
5.4     2       0.1%

Value   Count   Frequency (%)
70.1    1       < 0.1%
65.5    1       < 0.1%
64.7    1       < 0.1%
64.4    1       < 0.1%
64.1    1       < 0.1%
64      1       < 0.1%
62.5    1       < 0.1%
62.1    2       0.1%
62      1       < 0.1%
61.7    1       < 0.1%

ci90ubinc_2019
Real number (ℝ)

Distinct: 3030
Distinct (%): 96.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60427.026
Minimum: 27183
Maximum: 156109
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.7 KiB

Quantile statistics

Minimum: 27183
5-th percentile: 41436
Q1: 50859
median: 58056
Q3: 67246
95-th percentile: 89416
Maximum: 156109
Range: 128926
Interquartile range (IQR): 16387

Descriptive statistics

Standard deviation: 14922.953
Coefficient of variation (CV): 0.24695825
Kurtosis: 3.4403095
Mean: 60427.026
Median Absolute Deviation (MAD): 8197
Skewness: 1.3660645
Sum: 1.8980129 × 10^8
Variance: 2.2269451 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
66646                 3       0.1%
54920                 2       0.1%
68030                 2       0.1%
46460                 2       0.1%
56927                 2       0.1%
53398                 2       0.1%
46289                 2       0.1%
66439                 2       0.1%
49752                 2       0.1%
62688                 2       0.1%
Other values (3020)   3120    99.3%

Value   Count   Frequency (%)
27183   1       < 0.1%
29147   1       < 0.1%
29714   1       < 0.1%
30318   1       < 0.1%
31017   1       < 0.1%
31654   1       < 0.1%
31846   1       < 0.1%
32180   1       < 0.1%
32537   1       < 0.1%
32676   1       < 0.1%

Value    Count   Frequency (%)
156109   1       < 0.1%
152838   1       < 0.1%
140574   1       < 0.1%
135527   1       < 0.1%
130916   1       < 0.1%
129701   1       < 0.1%
127585   1       < 0.1%
127525   1       < 0.1%
125882   1       < 0.1%
125623   1       < 0.1%

Interactions

[25 interaction plots; images not preserved in this export]

Correlations

                            fips     urban_influence_code_2013   ci90ub517_2019   ci90ub517p_2019   ci90ubinc_2019   state
fips                        1.000    -0.018                      -0.042           -0.056            0.060            1.000
urban_influence_code_2013   -0.018   1.000                       -0.614           0.315             -0.394           0.206
ci90ub517_2019              -0.042   -0.614                      1.000            -0.006            0.057            0.111
ci90ub517p_2019             -0.056   0.315                       -0.006           1.000             -0.881           0.222
ci90ubinc_2019              0.060    -0.394                      0.057            -0.881            1.000            0.234
state                       1.000    0.206                       0.111            0.222             0.234            1.000
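
The high-correlation alerts in the Overview correspond to the off-diagonal entries with the largest absolute values. A sketch that recovers those pairs from the matrix, assuming a cut-off of 0.5; that threshold is an assumption for illustration and is not stated in the report:

```python
names = ["fips", "urban_influence_code_2013", "ci90ub517_2019",
         "ci90ub517p_2019", "ci90ubinc_2019", "state"]

# Correlation matrix copied from this section (rows and columns follow `names`).
matrix = [
    [1.000, -0.018, -0.042, -0.056, 0.060, 1.000],
    [-0.018, 1.000, -0.614, 0.315, -0.394, 0.206],
    [-0.042, -0.614, 1.000, -0.006, 0.057, 0.111],
    [-0.056, 0.315, -0.006, 1.000, -0.881, 0.222],
    [0.060, -0.394, 0.057, -0.881, 1.000, 0.234],
    [1.000, 0.206, 0.111, 0.222, 0.234, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

# Scan the upper triangle only, so each pair is reported once.
strong_pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]

for a, b, r in strong_pairs:
    print(f"{a} ~ {b}: {r:+.3f}")
```

Under that assumption the scan yields three pairs, matching the three pairs of variables named in the alerts.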

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

       fips    state   area_name           urban_influence_code_2013   ci90ub517_2019   ci90ub517p_2019   ci90ubinc_2019
0      01001   AL      Autauga County      2.0                         1850             19.4              63949
1      01003   AL      Baldwin County      2.0                         5987             17.2              65149
2      01005   AL      Barbour County      6.0                         1822             49.0              40122
3      01007   AL      Bibb County         1.0                         1050             32.7              53545
4      01009   AL      Blount County       1.0                         2493             25.8              59027
5      01011   AL      Bullock County      6.0                         807              53.2              35563
6      01013   AL      Butler County       6.0                         1307             41.9              42839
7      01015   AL      Calhoun County      2.0                         4996             28.5              51478
8      01017   AL      Chambers County     5.0                         1917             38.8              46692
9      01019   AL      Cherokee County     6.0                         1076             29.2              51189

       fips    state   area_name           urban_influence_code_2013   ci90ub517_2019   ci90ub517p_2019   ci90ubinc_2019
3131   56027   WY      Niobrara County     12.0                        70               26.3              54455
3132   56029   WY      Park County         11.0                        708              16.4              65050
3133   56031   WY      Platte County       11.0                        228              18.5              62974
3134   56033   WY      Sheridan County     8.0                         566              12.2              71084
3135   56035   WY      Sublette County     10.0                        158              10.0              86158
3136   56037   WY      Sweetwater County   8.0                         872              11.1              87841
3137   56039   WY      Teton County        8.0                         201              6.7               111143
3138   56041   WY      Uinta County        8.0                         479              11.1              78321
3139   56043   WY      Washakie County     11.0                        226              17.4              60194
3140   56045   WY      Weston County       9.0                         170              16.7              66545